Analysis of Extended Performance for clustering of Satellite Images Using Bigdata Platform Spark
Abstract
Clustering techniques have been widely adopted in many real-world data analysis applications, such as customer behavior analysis, targeted marketing, and digital forensics. As satellite imagery is now generated at a higher rate than in previous decades, better solutions are needed in terms of both accuracy and performance. In this paper, we propose a big data solution that clusters images using three methods: Scalable K-means++, Bisecting K-means, and Gaussian Mixture. Since the number of clusters is not known in advance for any of these methods, we also propose validating the number of clusters with the Simple Silhouette Index algorithm, so as to obtain the best clustering possible.
Keywords: Images, Distributed Processing, Scalable Kmeans++, K-means Clustering, Bigdata, Datamining, Security, Gaussian Mixture
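The abstract does not spell out the Simple Silhouette Index, so the following is only a minimal sketch under assumptions: it uses the commonly described simplified silhouette, where each point is compared against cluster centroids rather than against every other point. It runs in plain Python on toy 2-D feature vectors, not on Spark, and the helper names (`kmeans`, `simplified_silhouette`, `best_k`) and the farthest-point seeding are illustrative choices, not the paper's implementation.

```python
import math

def dist(p, q):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]

def kmeans(points, k, iters=20):
    """Plain Lloyd iterations with deterministic farthest-point seeding.

    Assumption: the seeding is chosen for reproducibility only; it is not
    the seeding used by Scalable K-means++.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return assign(points, centroids), centroids

def simplified_silhouette(points, labels, centroids):
    """Mean of (b - a) / max(a, b), where a is the distance to the point's
    own centroid and b the distance to the nearest other centroid."""
    scores = []
    for p, l in zip(points, labels):
        a = dist(p, centroids[l])
        b = min(dist(p, c) for j, c in enumerate(centroids) if j != l)
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)

def best_k(points, k_values):
    """Score each candidate k (k >= 2) and return the best one with all scores."""
    results = {}
    for k in k_values:
        labels, centroids = kmeans(points, k)
        results[k] = simplified_silhouette(points, labels, centroids)
    return max(results, key=results.get), results

# Toy data: three well-separated 2x2 blobs standing in for pixel features.
toy_points = [(x + dx, y + dy)
              for (x, y) in [(0, 0), (10, 10), (0, 10)]
              for dx in (0, 1) for dy in (0, 1)]

if __name__ == "__main__":
    k, scores = best_k(toy_points, [2, 3, 4, 5])
    print("chosen k:", k)
    for cand, score in sorted(scores.items()):
        print(cand, round(score, 3))
```

On this toy data the score peaks at k = 3, matching the three blobs. Because the simplified score needs only point-to-centroid distances, its cost grows linearly with the number of points, which is presumably why it suits a distributed setting.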
Similar resources
Data Clustring Using A New CGA(Chaotic-Generic Algorithm) Approach
Clustering is the process of dividing a set of input data into a number of subgroups. The members of each subgroup are similar to each other but different from members of other subgroups. The genetic algorithm has enjoyed many applications in clustering data. One of these applications is the clustering of images. The problem with the earlier methods used in clustering images was in selecting in...
On the usability of Hadoop MapReduce, Apache Spark & Apache Flink for data science
Distributed data processing platforms for cloud computing are important tools for large-scale data analytics. Apache Hadoop MapReduce has become the de facto standard in this space, though its programming interface is relatively low-level, requiring many implementation steps even for simple analysis tasks. This has led to the development of advanced dataflow oriented platforms, most prominently...
Object-Oriented Method for Automatic Extraction of Road from High Resolution Satellite Images
As the information carried in a high spatial resolution image is not represented by single pixels but by meaningful image objects, which include the association of multiple pixels and their mutual relations, the object based method has become one of the most commonly used strategies for the processing of high resolution imagery. This processing comprises two fundamental and critical steps towar...
Asphalt Pavement Performance Model of Airport Using Microwave Remote Sensing Satellite
The purpose of this study is to build the binary logit model of an airport pavement that could monitor the pavement condition in near real time using microwave remote sensing satellite, then the relationship between the international roughness index (IRI) of an airport and backscattering values from PALSAR images of the ALOS satellite was determined. Total 390 data were used in analysis. This m...
Publication date: 2017